Abstract: Inverting hash values by brute-force computation is
one of the latest security threats to password-based
authentication. New technologies are being developed for
brute-force computation, and these increase the success rate of
the inversion attack. A honeyword-based authentication protocol
can mitigate this threat by making password cracking detectable.
However, the existing schemes have several limitations, such as
Multiple System Vulnerability, weak DoS resistivity and storage
overhead. In this paper we propose a new honeyword generation
approach, the Paired Distance Protocol (PDP), which overcomes
almost all the drawbacks of previously proposed honeyword
generation approaches. A comprehensive analysis shows that PDP
not only attains a high detection rate of about 97.2% but also
greatly reduces the storage cost.

Keywords: Authentication; Password; Inversion attack; Honeyword;
Paired distance.

I. Introduction 

Password-based authentication is one of the most widely used
authentication techniques, as it balances security and usability
well. However, like any other security scheme, password-based
schemes have been challenged by different attack models over
time. One such recently developed attack model is the inversion
attack, described next.

A. Inversion attack model 

When creating a web account, a user registers with the site by
submitting a username and a password. The system stores the
username in plain text, whereas the password is converted into a
hash (possibly with an added salt) using a hashing algorithm H.
Thus, the user's login credential stored by the system can be
represented by a tuple < Ui, H(pi) >. Under the inversion attack
model, the adversary inverts the hashes (evaluating pi from
H(pi)) found in the compromised password file F. To invert a
hash, the adversary first derives a candidate password string
using an existing guessing technique. The adversary then
converts the candidate (after appending the salt, if required)
into a hash value using H. If the obtained hash value matches
the stored hash value, the adversary has successfully inverted
the hash.
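
The matching step of the inversion attack can be sketched as follows. This is only an illustration under assumptions that are not taken from the paper: H is taken to be SHA-256 with the salt appended to the password, and the salt, the candidate list and the names `hash_password` and `invert_hash` are hypothetical.

```python
import hashlib
from typing import Iterable, Optional


def hash_password(password: str, salt: bytes = b"") -> str:
    """H(p): hash of the password with the salt appended (SHA-256 assumed)."""
    return hashlib.sha256(password.encode("utf-8") + salt).hexdigest()


def invert_hash(stored_hash: str, candidates: Iterable[str],
                salt: bytes = b"") -> Optional[str]:
    """Return the first candidate whose hash equals the stored hash, if any."""
    for guess in candidates:
        if hash_password(guess, salt) == stored_hash:
            return guess
    return None


# Hypothetical entry <Ui, H(pi)> taken from a compromised password file F.
salt = b"x9"
stored = hash_password("secret123", salt)
guesses = ["123456", "password", "secret123", "letmein"]
print(invert_hash(stored, guesses, salt))
```

In a real attack the candidate list would come from a guessing tool rather than a fixed list; the comparison loop stays the same.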

Initially, brute-force attacks were conducted by guessing many
possible combinations to break a password. The time complexity
of this approach is very high, as the attacker tries every
possible option to crack a password. One publicly available
password-cracking tool that significantly reduces the time
complexity of the inversion attack is John the Ripper. In 2009,
using probabilistic context-free grammars, Weir et al. were able
to crack 28% to 129% more passwords than John the Ripper. A more
recent technique by Ma et al. uses a Markov chain model for
password cracking and shows significant improvement over the
algorithm proposed by Weir et al.

Evidence: There is strong evidence of inversion attacks
threatening the security of reputed web-based organizations. In
the recent past, almost 50 million Evernote passwords were
compromised through an inversion attack. Large web-service
organizations such as LinkedIn, Yahoo and RockYou have suffered
the same fate. There is therefore an urgent need for an improved
honeyword-based framework that robustly handles the inversion
attack.

B. Existing security techniques 

A few security techniques have been developed to address this
issue. One option is to transform the user's password into a
hash value that is harder to invert. This kind of login setup
increases the login time and does not make successful password
cracking detectable. Another alternative is for the
administrator to set up a few fake login accounts. When an
adversary successfully inverts the hash value of such an account
and logs in with it, the system detects the security breach.
With some careful analysis, however, the adversary can
distinguish the real usernames from the system-generated ones.

Honeyword-based approaches have shown significant potential for
providing security against the inversion attack. In this
approach the system maintains, for each user, a list of
passwords that contains the user's real password along with some
system-generated passwords, known as honeywords. The system
generates these honeywords using an underlying honeyword
generation algorithm such as take-a-tail or modelling-syntax.
Once the password file F is compromised and the adversary enters
any of the honeywords from the password list Wi, the system
identifies the attack and takes the necessary actions according
to its security policy.
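
A minimal sketch of this detection idea is given below. It assumes a classical honeyword setup in which the login server stores the hashed sweetwords and a separate honeychecker stores the index of the real password; the sample words, SHA-256 and all names (`password_file`, `honeychecker`, `login`) are hypothetical and chosen only for illustration.

```python
import hashlib
from typing import Dict, List


def h(password: str) -> str:
    """Hash function H (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Password file F on the login server: username -> hashed sweetwords Wi.
password_file: Dict[str, List[str]] = {
    "alice": [h(w) for w in ["tiger41", "tiger87", "tiger23", "tiger65"]],
}
# Honeychecker on a separate system: username -> index of the real password.
honeychecker: Dict[str, int] = {"alice": 2}


def login(username: str, submitted: str) -> str:
    """Classify an attempt as 'ok', 'wrong-password' or 'honeyword-alarm'."""
    sweetwords = password_file.get(username, [])
    digest = h(submitted)
    if digest not in sweetwords:
        return "wrong-password"
    index = sweetwords.index(digest)
    return "ok" if honeychecker[username] == index else "honeyword-alarm"


print(login("alice", "tiger23"))
print(login("alice", "tiger87"))
```

An attacker who has inverted every hash in F still has to pick one sweetword, and any pick other than the real password raises the alarm.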


C. Motivation and Contribution 


Among all the honeyword generation techniques proposed so far,
the take-a-tail approach (see Section II-C) sets the strongest
security standard. However, the technique severely harms
usability, as a user with n different login accounts has to
remember n different pieces of system-generated information as
part of the login credentials. Moreover, all proposed honeyword
generation techniques need to store k-1 (k > 1) honeywords per
user to lure attackers. Storing k-1 extra items for each
username greatly increases the storage cost.

From the existing literature we draw the motivations behind this
work, summarized below.

• Motivation 1: All existing honeyword generation techniques
store k-1 decoy passwords to detect a security breach. This
storage cost, which grows with the number of users, needs to be
minimized.

• Motivation 2: To date, "take-a-tail" sets the highest security
standard, but it badly harms usability, as the user needs to
remember n different pieces of system-generated information for
n different accounts. We also believe that remembering n
different pieces of information for n accounts is infeasible for
most users because of the limits of human memory [20]. Thus, a
honeyword-based security architecture is needed that ensures the
same security standard as "take-a-tail" while enhancing
usability.

Motivated by the above, we make the following two major
contributions in this paper.

• Contribution 1: We propose a new method, termed the Paired
Distance Protocol (PDP), for generating honeywords. The method
stores only a single piece of information to generate the
honeywords and thus significantly reduces the storage burden.

• Contribution 2: Using the proposed technique, users need to
remember only a single piece of information (of their own
choice) to maintain n different accounts. Thus, instead of
remembering n different pieces of system-generated information
for n different accounts (as in "take-a-tail"), the user
remembers a single piece of information and still obtains the
same security standard as "take-a-tail".

Roadmap: In Section II we give an overview of existing honeyword
generation algorithms and their limitations. In Section III we
introduce the proposed PDP approach and show how it detects the
attack. The security and usability analyses of PDP are presented
in Section IV and Section V respectively. In Section VI we show
how PDP minimizes the memory overhead. A detailed comparative
analysis of PDP with existing security techniques is provided in
Section VII. In Section VIII we give a brief outline of existing
work in this direction. Finally, we conclude and give some
future directions of our work in Section IX.


II. An overview of the honeyword-based authentication technique and its limitations

In this section we first describe the working principle of the
honeyword-based authentication scheme. Thereafter we present the
limitations of the existing schemes proposed in this direction.
Before that, the notation we are going to use is presented in
Table I.

Notation  | Meaning
----------+-------------------------------------
Ui        | i-th user in the system
pi        | password of the i-th user
Wi        | tuple of passwords stored for Ui
k         | number of elements in Wi
ci        | index of the correct password in Wi
sweetword | each element of Wi

TABLE I: Related notations


A. Honeyword based authentication technique

As mentioned in Section I-B, a honeyword generation scheme
maintains a list Wi for each username Ui. The index of the
correct password is maintained in another file on a different
system, known as the "honeychecker". The basic idea is that even
if Wi is compromised and the adversary successfully inverts
every sweetword, the adversary remains unsure of the user's
original password, because the user's complete password
information is distributed over two different systems. If the
adversary picks a sweetword from the list Wi and submits it
against user id Ui, the index of that sweetword (li) is sent to
the honeychecker. If li matches ci, the honeychecker sends
positive feedback to the system administrator; otherwise it
sends negative feedback. Depending on the security policy, the
system administrator takes the necessary actions according to
the feedback received from the honeychecker. Thus, a
honeyword-based system provides distributed security, which is
harder to compromise as a whole.


B. Limitations of honeyword based authentication technique 

Although existing honeyword-based approaches can provide
security against brute-force attacks, they have a few
limitations, described below.

(a) Storage overhead: With a honeyword generation approach, the
system needs to store k-1 additional passwords for each user
account. A system with n user accounts thus needs to store
n x (k-1) extra items, which greatly increases the storage cost.
This is one of the major drawbacks of any honeyword generation
approach.

(b) Co-relational hazard: If there is a relationship between the
username and the password (e.g. username football and password
maradona), the user's original password can easily be identified
in the list Wi. In such cases honeywords cannot mask the
original password.

(c) Distinguishable well-known password patterns: If the user
chooses a password related to some well-known object or fact,
the attacker can easily identify the original password. Examples
of passwords in this category are bond007, james007, 007bond and
007007, all of which were found in a list of the 10000 most
common passwords.

(d) Issue related to DoS resistivity: If the adversary can guess
the honeywords while knowing the user's original password, the
adversary can intentionally submit a honeyword so that the
honeychecker raises a false negative-feedback signal while F is
not compromised. The adversary can submit honeywords from many
user accounts (either by creating the accounts or by learning
users' original passwords through shoulder surfing), so that the
system believes the password file F has been compromised when it
has not. If the system detects honeyword submissions from too
many accounts, it may block the whole web server. This is known
as a Denial-of-Service (DoS) attack. To avoid it, the user's
original password must not give any hint about the
system-generated honeywords. Some honeyword generation
techniques, such as chaffing-by-tweaking-digits, provide weak
security against this kind of attack, while others, such as the
modelling-syntax approach, provide strong security against DoS.

(e) Issue related to Multiple System Vulnerability: If a user
uses the same password on two (or more) different systems that
use the same honeyword generation algorithm, and an adversary
gets access to both systems, Multiple System Vulnerability may
occur. In this case the adversary obtains two lists Wi for user
Ui. Let $W_i^{S_j}$ denote the list of sweetwords for user Ui in
system Sj. If the honeywords in $W_i^{S_p}$ and $W_i^{S_q}$
(where p ≠ q) are different (the probability of which is close
to 1), then by computing the intersection
$W_i^{S_p} \cap W_i^{S_q}$ the adversary obtains the original
password. This is identified as the Multiple System
Vulnerability (MSV) of the honeyword-based authentication
technique.
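
The intersection step behind MSV can be illustrated with a short sketch. The two sweetword lists are hypothetical examples of what two independent systems might hold for the same password; only the set intersection itself reflects the attack described above.

```python
from typing import Iterable, Set


def intersect_sweetwords(list_p: Iterable[str], list_q: Iterable[str]) -> Set[str]:
    """Sweetwords present on both compromised systems."""
    return set(list_p) & set(list_q)


# Hypothetical sweetword lists for the same user on systems Sp and Sq.
# The honeywords were generated independently, so only the real password
# (street613 in this example) appears in both lists.
w_p = ["street124", "street498", "street668", "street613", "street153"]
w_q = ["street905", "street613", "street271", "street340", "street782"]

print(intersect_sweetwords(w_p, w_q))
```

If the two honeyword sets happened to overlap, the intersection would contain more than one string, which is why the argument relies on the overlap probability being close to 0.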

Enter Username : Alice
Enter Password Choice : ********
Append 613 to complete your password
Enter Revised Password : ***********

Fig. 1: Registration interface of take-a-tail

(f) Issue regarding typo safety: A honeyword generation
technique is called typo safe if a typing mistake by the user
while entering the password does not match any of the
honeywords. "Chaffing-by-tweaking" methods are not very typo
safe, as a legitimate user may accidentally submit a honeyword.
Consider a user who chooses the password road5.
"Chaffing-by-tweaking-digits" may then produce the following
list of sweetwords for k = 6:

road9 road2 road5 road8 road4 road6

In this list the honeywords are created from the user password
road5 by replacing its single digit. The probability that a
typing mistake matches a honeyword for k = 6 is therefore 5/9
(if the user mistakenly enters a wrong digit while the prefix of
the password, here road, remains the same).
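
The 5/9 estimate for the typo-safety example can be checked with a small calculation. The sketch assumes that the password is road5, that the typo only replaces the final digit, and that each of the nine wrong digits is equally likely; the sweetword list and the function name are illustrative.

```python
from fractions import Fraction
import string
from typing import List


def typo_hit_probability(password: str, sweetwords: List[str]) -> Fraction:
    """Probability that a wrong final digit lands on a honeyword."""
    prefix, correct = password[:-1], password[-1]
    wrong = [prefix + d for d in string.digits if d != correct]
    hits = sum(1 for candidate in wrong if candidate in sweetwords)
    return Fraction(hits, len(wrong))


# Assumed list for k = 6: the real password plus five honeywords.
sweetwords = ["road9", "road2", "road5", "road8", "road4", "road6"]
print(typo_hit_probability("road5", sweetwords))  # 5/9
```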

The "take-a-tail" method proposed by Juels and Rivest addresses
all the above-mentioned drawbacks except storage overhead. A
brief overview of "take-a-tail" and its limitation is presented
next, before we go into the details of the proposed PDP
protocol.

C. Take-a-tail : An overview and limitation 

With "take-a-tail", the user first provides a username and a
password choice during registration. The system then generates a
random string of length ℓ (> 0) consisting of letters and/or
digits, called the tail. When logging in, the user must submit
the username and the password with the system-generated tail
appended. The user registration interface is shown in Fig. 1.

The example in Fig. 1 shows the system generating 613 as the
tail. During each login session the user must submit the
password with the tail appended. The system generates the
sweetwords by "chaffing-by-tweaking" the tail. Thus, for the
password street613 (where 613 is the tail), a possible list of
sweetwords for k = 5 is:

street124 street498 street668 street613 street153
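
The following sketch shows one way sweetwords of this form could be produced. It is not the original take-a-tail implementation: it assumes a purely numeric tail and uniformly random decoy tails, and the function names are hypothetical.

```python
import secrets
import string
from typing import List, Tuple


def random_tail(length: int = 3) -> str:
    """Random numeric tail (digits only is an assumption of this sketch)."""
    return "".join(secrets.choice(string.digits) for _ in range(length))


def take_a_tail(password_choice: str, k: int = 5,
                tail_len: int = 3) -> Tuple[str, List[str], int]:
    """Return (real password with tail, shuffled sweetwords, index of the real one)."""
    real_tail = random_tail(tail_len)
    tails = {real_tail}
    while len(tails) < k:               # requires k <= 10 ** tail_len
        tails.add(random_tail(tail_len))
    ordered = list(tails)
    secrets.SystemRandom().shuffle(ordered)
    sweetwords = [password_choice + t for t in ordered]
    real = password_choice + real_tail
    return real, sweetwords, sweetwords.index(real)


real, sweetwords, index = take_a_tail("street")
print(real, sweetwords, index)
```

The index returned last is what a honeychecker would store, while the sweetword list goes into the password file.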


Thus, even if a co-relational hazard exists, the adversary can
hardly distinguish the user's original password in the list of
sweetwords. As the tail differs for each of the user's login
accounts, MSV is also avoided even if the user chooses the same
password. Knowing the original password (along with the tail)
does not help the adversary guess the honeywords either, which
provides good security against the DoS attack. Thus
"take-a-tail" overcomes all the drawbacks mentioned in Section
II-B except the storage overhead.

As discussed earlier, the limitation of this approach lies in
remembering n different system-generated tails for n different
login accounts, which greatly reduces usability. Apart from
this, "take-a-tail" imposes the same storage overhead as other
honeyword generation approaches. Next we introduce the proposed
PDP, a storage-optimized honeyword generation approach with
enhanced usability.


III. Proposed methodology 

Our proposed approach is called the Paired Distance Protocol, or
PDP. With it, the user provides three pieces of information:
(a) a username, (b) a password and (c) a random string RS of
length ℓ, containing letters and digits of the user's own
choice. The default length of RS is 3. Thus, along with the
password, the user has to remember RS as another secret. This
may initially appear to be an overhead, but RS provides several
advantages, which we discuss in detail in the subsequent
sections. A few important characteristics of RS are discussed
next.

With PDP the user can use the same RS for different systems.
However, users are strongly advised to choose a random RS. The
randomness of RS is considered high if the chosen RS is hard to
guess, follows neither a specific pattern (e.g. sequential
keystrokes) nor a dictionary word (e.g. fox), and has no
correlation with the username or the password (a correlated
choice would be, e.g., username jerry, password face and RS
eye). No element of RS should be repeated, to avoid the DoS
attack (see Section IV-B).

Our assumption that the user sets an RS of high randomness is
reasonable, because remembering RS does not impose much overhead
on users:

• The string length of RS is small (taken as 3 to avoid specific
patterns such as a date of birth).

• The user may use the same RS for different login accounts.

The registration interface of PDP is shown in Fig. 2.

Enter Username : Alice 

Enter Password Choice : ****** 

Choose a random string 
to complete your password 
Enter Revised Password : ********* 


Fig. 2: Registration interface of PDP 

The fundamental differences between "take-a-tail" and the
proposed approach from the usability perspective are presented
in Table II.

Take-a-tail                              | PDP
-----------------------------------------+-----------------------------------------
User remembers the extra information     | User remembers the extra information
generated by the system                  | of his own choice
-----------------------------------------+-----------------------------------------
For n different accounts, the user must  | For n different accounts, the user may
remember n pieces of information         | remember a single piece of information

TABLE II: Differences between "take-a-tail" and PDP from the
usability perspective

Next we explain how honeywords are generated using the proposed
approach.

A. Setting up the honey circular list 

The recently proposed Sauth approach shows how different web
servers can collaborate to achieve a high security standard. We
follow the same principle to remove the MSV issue of the
honeyword-based authentication scheme. First, a circular list,
called the honey circular list or hcl, of length |hcl| is
created, which holds the letters and digits in random order. The
default value of |hcl| is 36 here. One instance of hcl for the
default value of |hcl| is shown in Fig. 3.

Fig. 3: Honey Circular List: contains letters and digits in
random order

This hcl is then securely distributed to m different systems
(e.g. Facebook, Google) that participate in creating honeywords
using the PDP protocol. The hcl is maintained in the password
file F. The role of hcl in avoiding DoS is elaborated in Section
IV-B.

B. Maintaining the user database 

To maintain the user's login information in the database, the
system does the following. It first stores the username along
with the user's password (possibly in hashed form). It then
measures the distance between consecutive elements of RS with
respect to the elements stored in hcl. The distance between any
two elements of the hcl is known as the paired distance and is
defined as follows.

Definition 1: Paired distance: The paired distance between two
elements e1 and e2, denoted Pr(e1, e2), is the number of cells
that have to be traversed in the clockwise direction in the
honey circular list to reach element e2 from element e1.

Suppose the user chooses RS as "tp7". With the hcl of Fig. 3,
the paired distances are Pr(t, p) = 35 and Pr(p, 7) = 6. Along
with the username and password, the system stores these paired
distances between consecutive elements of RS, separated by "-"
(e.g. 35-6).

Definition 2: Distance chain: The distance chain is the sequence
of n-1 paired distances (separated by "-") between every two
consecutive elements of an RS of length n.
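
Definitions 1 and 2 can be sketched in a few lines. The hcl below is hypothetical (it is not the list of Fig. 3, which is not reproduced here); its order was chosen by hand so that the "tp7" example yields 35-6. Treating "clockwise" as increasing list index is also an assumption of this sketch.

```python
from typing import List

# Hypothetical hcl: 26 letters and 10 digits in an arbitrary order.
HCL = "8kem3zbxq0pta5hw7c9jr1ud6ny2fv4gilos"
assert len(HCL) == 36 and len(set(HCL)) == 36


def paired_distance(e1: str, e2: str, hcl: str = HCL) -> int:
    """Pr(e1, e2): cells traversed clockwise in hcl to reach e2 from e1."""
    return (hcl.index(e2) - hcl.index(e1)) % len(hcl)


def distance_chain(rs: str, hcl: str = HCL) -> List[int]:
    """Paired distances between every two consecutive elements of RS."""
    return [paired_distance(a, b, hcl) for a, b in zip(rs, rs[1:])]


print("-".join(map(str, distance_chain("tp7"))))  # 35-6
```

Only the chain is written to the password file; RS itself is never stored there.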

Along with the username and password, the system maintains the
distance chain derived from RS in the password file F (instead
of storing k-1 honeywords). During our analysis we found a
special property of the distance chain, which we call the
uniqueness property and define next.

Uniqueness property of distance chain: Given an hcl and a
particular distance chain, RS can be uniquely derived if the
first element of RS is known.

Let us illustrate this with the previous example. Suppose the
distance chain 35-6 is known along with the first element of RS,
which is "t". With respect to the hcl shown in Fig. 3, the
string "tp7" can then be derived by reverse calculation, and it
is unique. If the first element of RS is unknown, a string that
produces the given distance chain can be derived starting from
each element of hcl. For example, for the distance chain 35-6,
reverse calculation yields strings such as "k8b" and "ekx" with
the hcl shown in Fig. 3. Thus, for a given distance chain, the
total number of possible RS is |hcl|.
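
The reverse calculation can be sketched as follows, again with a hypothetical hcl (not the one of Fig. 3) that was arranged so that the strings mentioned in the text, "tp7", "k8b" and "ekx", all share the chain 35-6. The function names are illustrative.

```python
from typing import List

HCL = "8kem3zbxq0pta5hw7c9jr1ud6ny2fv4gilos"  # hypothetical hcl


def derive_rs(first: str, chain: List[int], hcl: str = HCL) -> str:
    """Rebuild RS by walking clockwise from `first` by each paired distance."""
    pos = hcl.index(first)
    chars = [first]
    for step in chain:
        pos = (pos + step) % len(hcl)
        chars.append(hcl[pos])
    return "".join(chars)


def candidate_strings(chain: List[int], hcl: str = HCL) -> List[str]:
    """All strings an adversary can derive when the first element is unknown."""
    return [derive_rs(c, chain, hcl) for c in hcl]


candidates = candidate_strings([35, 6])
print(derive_rs("t", [35, 6]))  # tp7
print(len(candidates))          # 36
```

Each starting element gives exactly one candidate, so the stored chain hides RS among |hcl| equally plausible strings.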

Necessity of choosing RS: The strength of PDP depends on the
randomness of the string RS. The honeywords are generated using
RS and hcl. We found that users tend to include a meaningful
phrase in their passwords (e.g. secret123, where secret is
meaningful). From the distance chain stored by the system, the
adversary can derive the different possible strings, which
include the RS chosen by the user. If the user's password were
used in place of RS to generate the distance chain, the
adversary might be able to distinguish the user's original
password. The reason is that, because of the random arrangement
of characters in hcl, the probability of deriving a string that
contains a meaningful phrase (other than the user's original
password) is very low. As most user-created passwords contain a
meaningful phrase, the adversary could easily tell the user's
original password apart from the non-meaningful strings derived
from the hcl and the distance chain. Hence RS must be chosen to
be random enough that it cannot easily be guessed by the
adversary.

C. Maintaining the honeychecker 

In existing approaches the honeychecker generally maintains the
index of the user's original password along with the username.
In the proposed approach the honeychecker maintains, along with
the username, the first character of the RS chosen by the user.

D. Working principle of the proposed scheme

During login, when the user submits the login credentials, the
system first checks the password entered by the user. If the
password is incorrect, the system immediately denies the login.

If the password is correct, the system derives the distance
chain from the submitted RS. If the derived distance chain does
not match the distance chain stored in the password file F, the
system denies the login.

If the derived distance chain matches the stored one, the system
sends the first element of the submitted RS to the honeychecker.
If this element matches the element stored in the honeychecker,
the honeychecker sends positive feedback to the administrator;
otherwise it sends negative feedback, signalling a detected
attack.
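
The three login steps can be put together in a minimal sketch. Everything concrete here is an assumption made for illustration: the hcl is hypothetical, SHA-256 stands in for H, the password file and honeychecker are plain dictionaries, and the return strings are invented labels.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List

HCL = "8kem3zbxq0pta5hw7c9jr1ud6ny2fv4gilos"  # hypothetical hcl, not Fig. 3


def h(password: str) -> str:
    """Hash function H (SHA-256 assumed for this sketch)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def distance_chain(rs: str) -> List[int]:
    """Clockwise paired distances between consecutive characters of RS."""
    return [(HCL.index(b) - HCL.index(a)) % len(HCL) for a, b in zip(rs, rs[1:])]


@dataclass
class Record:
    password_hash: str
    chain: List[int]


password_file: Dict[str, Record] = {}  # file F on the login server
honeychecker: Dict[str, str] = {}      # separate system: first character of RS


def register(username: str, password: str, rs: str) -> None:
    password_file[username] = Record(h(password), distance_chain(rs))
    honeychecker[username] = rs[0]


def login(username: str, password: str, rs: str) -> str:
    record = password_file.get(username)
    if record is None or h(password) != record.password_hash:
        return "denied"                 # step 1: wrong password
    if not rs or any(c not in HCL for c in rs) or distance_chain(rs) != record.chain:
        return "denied"                 # step 2: distance chain mismatch
    if honeychecker[username] == rs[0]:
        return "accepted"               # step 3: positive feedback
    return "attack-detected"            # step 3: negative feedback


register("alice", "face2face", "tp7")
print(login("alice", "face2face", "tp7"))  # accepted
print(login("alice", "face2face", "k8b"))  # attack-detected
print(login("alice", "face2face", "tp8"))  # denied
```

The second call models an adversary who recovered the password and the chain from F and guessed a wrong starting element; the third models a user typo.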

E. Evaluating the probability of detecting the attack 

Instead of storing k-1 extra items, PDP stores just one extra
item, the distance chain. There can be |hcl| probable RS
corresponding to a distance chain. Thus, by storing a single
item, the system confuses the attacker among |hcl| different
possibilities. For the default value |hcl| = 36, an attacker who
picks one of these candidates at random picks a wrong one, and
is therefore detected, with probability 35/36 (about 97.2%).

F. Password meter 

The password meter shows how random RS is. If the randomness of
RS is high, the password meter shows a strong signal; otherwise
it shows a weak signal. Some choices of RS for which the
randomness is low are:

• RS concatenated with the user password forms a dictionary word
(e.g. password rab, RS bit).

• RS itself is a dictionary word (e.g. fox).

• RS follows a specific pattern (such as sequential keystrokes)
distinguishable by the attacker.

Users are recommended to change their RS if the password meter
shows low randomness. If there is a correlation among username,
password and RS, the randomness of RS is also low, although the
password meter is not able to detect that.

IV. Security standards 

There are three well-defined security parameters for evaluating
the robustness of a honeyword generation algorithm:
(a) flatness, (b) DoS resiliency and (c) security against MSV.
Next we evaluate the strength of PDP against these three
standards, along with a new security factor termed collaborative
security.

A. Flatness 

If the system maintains k sweetwords for a user Ui, the attacker
may be confused among k possible options once Wi is compromised.
Sometimes, however, the adversary can easily identify the
password chosen by the user in the list Wi (e.g. if there is a
correlation between username and password). A honeyword
generation algorithm is said to be perfectly-flat if the
adversary has no advantage in identifying the user's original
password in the list Wi; the probability of selecting the
original password from Wi is then 1/k. If this probability is
slightly greater than 1/k, the honeyword generation algorithm is
called approximately-flat. A good honeyword generation algorithm
is required to be perfectly-flat.

PDP is a perfectly-flat technique if the randomness of the
chosen RS is high.

B. DoS resiliency 

Performing the DoS attack (discussed in Section II-B) is highly
unlikely on a PDP-secured system. The DoS attack is only
possible if the adversary can produce the distance chain
maintained by the system from an RS different from the one
chosen by the user. As RS does not allow repetition of
characters, the adversary needs to know the arrangement of
characters in the hcl to perform the attack. If RS allowed
repetition of characters, the adversary could, for example,
register an account with RS RRR and then submit RS as SSS at
login to perform the DoS attack. Both strings yield the same
distance chain 0-0, but the first character stored in the
honeychecker (here R) does not match the first character of the
submitted RS (here S). Hence the adversary would succeed in
mounting the DoS attack.

As all the elements of RS differ from each other, the
probability of generating a given distance chain, without
knowing the arrangement of characters in the hcl, by submitting
an RS that was not chosen by the user is given by Equation 1.

$$\text{Prob} = (|hcl| - 1) \times \prod_{i=0}^{\ell-1} \frac{1}{|hcl| - i} \tag{1}$$

For the default values of the parameters (ℓ = 3 and |hcl| = 36),
the probability of a successful DoS attack is
$35 / (36 \times 35 \times 34) = 1/1224 \approx 0.81 \times 10^{-3}$,
which is very low.
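
The value of this probability can be reproduced with exact arithmetic. The sketch assumes the product form (|hcl| - 1) multiplied by 1/(|hcl| - i) for i = 0, ..., ℓ-1, which is the reading of Equation 1 that matches the quoted value of about 0.81 x 10^-3; the function name is hypothetical.

```python
from fractions import Fraction


def chain_collision_probability(hcl_size: int = 36, rs_len: int = 3) -> Fraction:
    """(|hcl| - 1) * prod_{i=0}^{l-1} 1 / (|hcl| - i)."""
    prob = Fraction(hcl_size - 1)
    for i in range(rs_len):
        prob *= Fraction(1, hcl_size - i)
    return prob


p = chain_collision_probability()
print(p, float(p))
```

The numerator counts the |hcl| - 1 strings other than the real RS that yield the stored chain, and the denominator counts all strings of length ℓ without repeated characters.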


C. Security against MSV 

A user may use the same password on Z (> 1) different systems
that use the same honeyword generation algorithm. If two such
systems are compromised, the adversary can obtain the user's
original password by performing an intersection operation. This
is because, for a given password, a honeyword generation
algorithm produces different honeywords at each run with very
high probability (close to 1), and therefore produces different
honeywords on each system.

If PDP is adopted by Z different systems that secretly share an
hcl, then for a given RS all the system-generated distance
chains are the same. Thus, even if two different accounts of a
user (using the same password) are compromised, MSV does not
occur.

Another important observation is that, by identifying user Ui's
login credentials on one system, the adversary cannot guess the
password or RS used by Ui on other accounts unless both the
password and the RS are the same.


D. Collaborative security 

In the PDP approach, if a system senses that the hcl has been
compromised (after the honeychecker generates negative feedback
for E (> 1) user accounts), it broadcasts a security message
(sm) to all the other systems that generate honeywords using the
same hcl. Once the sm has been received by all the systems, a
new hcl (a different arrangement of the same set of characters)
is generated by the compromised system and sent to all the
systems under this PDP approach. Fig. 4 gives a pictorial
overview of how the hcl is shared among the different systems.

[Figure 4: message sequence among Sys 1, Sys 2, Sys 3, ...,
Sys n-1, Sys n. The compromised system Sys 2 broadcasts sm.2,
receives ack.1, ack.3, ..., ack.n-1, ack.n, then distributes
u-hcl.2 and again receives ack.1, ack.3, ..., ack.n-1, ack.n.]

Fig. 4: The PDP protocol used by n different systems sharing the
same hcl. The vertical dotted line indicates the compromised
system; sm.i denotes the security message from the i-th system,
ack.i the acknowledgement generated by the i-th system, and
u-hcl.i the updated hcl generated by the i-th system. The
compromised system generates the updated hcl after it receives
acknowledgements from all the other n-1 systems.

After receiving the new hcl, each system does the following:

• From the first character of RS stored in the honeychecker, the
distance chain and the previous hcl, the system regenerates (and
temporarily stores) RS for each user. For example, after
receiving t as the first character of RS from the honeychecker,
the system can derive the complete RS as tp7 with the help of
the distance chain 35-6 and the hcl shown in Fig. 3.

• After deriving RS for each user, the previous hcl is replaced
by the new hcl with its different arrangement of characters.

• By calculating the paired distances between consecutive
characters of RS in the new hcl, the system derives the new
distance chain for each user and communicates the first
character of RS to the honeychecker.

• After the complete password information has been set, the
system removes the stored RS of each user.
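
The per-user update performed after a new hcl arrives can be sketched as below. The old hcl is hypothetical, the new one is produced by a seeded shuffle purely so that the example is repeatable (a deployment would use a cryptographically secure source of randomness), and all function names are illustrative.

```python
import random
from typing import List

OLD_HCL = "8kem3zbxq0pta5hw7c9jr1ud6ny2fv4gilos"  # hypothetical previous hcl


def distance_chain(rs: str, hcl: str) -> List[int]:
    return [(hcl.index(b) - hcl.index(a)) % len(hcl) for a, b in zip(rs, rs[1:])]


def derive_rs(first: str, chain: List[int], hcl: str) -> str:
    pos, chars = hcl.index(first), [first]
    for step in chain:
        pos = (pos + step) % len(hcl)
        chars.append(hcl[pos])
    return "".join(chars)


def reshuffled_hcl(old_hcl: str, rng: random.Random) -> str:
    """New arrangement of the same set of characters."""
    chars = list(old_hcl)
    rng.shuffle(chars)
    return "".join(chars)


def rekey(first_char: str, old_chain: List[int], old_hcl: str, new_hcl: str) -> List[int]:
    """Recover RS from the old hcl, then compute its chain under the new hcl."""
    rs = derive_rs(first_char, old_chain, old_hcl)  # held only temporarily
    return distance_chain(rs, new_hcl)              # RS is dropped on return


new_hcl = reshuffled_hcl(OLD_HCL, random.Random(2024))
new_chain = rekey("t", [35, 6], OLD_HCL, new_hcl)
print(derive_rs("t", new_chain, new_hcl))  # tp7
```

The user's RS does not change; only the stored chain does, so the user is not involved in the update.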

Thus, by choosing an RS of high randomness, the user obtains a
high security standard in terms of flatness, DoS resistivity and
security against MSV.

V. Usability standards 

The usability standard, set by a honeyword generation 
approach can be measured in terms of three parameters — 
(a) Typo safety (b) System interference and (c) Stress on 
memorability. Each of these are discussed next. 

A. Typo safety 

A honeyword generation algorithm is called typo safe if a
typing mistake by the user does not cause the honeychecker to
generate a negative feedback signal. Using PDP, the
honeychecker generates a negative feedback signal only if a
string other than RS derives a distance chain that matches the
stored distance chain. While typing the RS, the user can enter
either (a) part of RS wrongly or (b) all the elements of RS
wrongly. If the user enters part of RS wrongly (e.g. tp8
instead of tp7), the input never evaluates to a distance chain
that matches the stored one. If the user enters all the
elements of RS wrongly by typing mistake (which rarely
happens), the probability that the same distance chain as the
stored one is generated can be derived by Equation 2, which is
the same as Equation 1:

$$\mathrm{Prob} = (|hcl| - 1) \times \prod_{i=0}^{l-1} \frac{1}{|hcl| - i} \qquad (2)$$

For the default values of the parameters, Equation 2 evaluates
to 0.81 × 10^-3, which is very small. Thus, PDP is highly typo
safe.

Honeyword method | Flatness              | DoS resiliency | Security against MSV | System interference | Typo safety | Stress on memorability | Storage overhead
-----------------|-----------------------|----------------|----------------------|---------------------|-------------|------------------------|-----------------
CTD              | 1/k if U ≈ G          | low            | low                  | no                  | low         | low                    | k-1
modelling-syntax | 1/k if U ≈ G          | high           | low                  | no                  | high        | low                    | k-1
take-a-tail      | 1/k (unconditionally) | low            | high                 | high                | high        | high                   | k-1
PDP              | 1/k @                 | high           | high                 | low                 | high        | low                    | 1

TABLE III: Comparative usability analysis of honeyword generation methods. U ≈ G indicates that honeywords are distributed
like user-chosen passwords from the adversary's point of view. @ indicates that the randomness of RS is high. Storage overhead
shows the extra information the system has to store in password file F.
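
The typo-safety probability can be checked numerically. The sketch
below assumes |hcl| = 36 and an RS of length 3 with distinct
characters; these values are inferred from the tp7 example and the
reported figure, and are not stated in this section. It also assumes
that the paired distance is the clockwise step count in the hcl. The
function names are hypothetical.

```python
import string
from itertools import permutations


def typo_match_probability(hcl_size, rs_length):
    """Evaluate (|hcl| - 1) * prod_{i=0}^{l-1} 1 / (|hcl| - i)."""
    prob = float(hcl_size - 1)
    for i in range(rs_length):
        prob /= (hcl_size - i)
    return prob


def count_colliding_strings(hcl, rs):
    """Count strings other than rs (distinct characters) with the same chain."""
    n = len(hcl)

    def chain(s):
        # Assumed paired distance: clockwise steps between neighbours.
        return [(hcl.index(b) - hcl.index(a)) % n for a, b in zip(s, s[1:])]

    target = chain(rs)
    return sum(
        1
        for cand in permutations(hcl, len(rs))
        if "".join(cand) != rs and chain(cand) == target
    )


hcl = list(string.ascii_lowercase + string.digits)  # 36 symbols, assumed
prob = typo_match_probability(len(hcl), 3)          # 35 / (36 * 35 * 34)
collisions = count_colliding_strings(hcl, "tp7")    # wrong strings that match
```

With these assumed values, prob is 35/(36·35·34) = 1/1224, roughly
8.17 × 10^-4, and the brute-force count finds 35 colliding strings
out of 36·35·34 candidates, which matches the numerator of the
formula.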


B. System interference 

System interference of a honeyword system reflects how much a
honeyword generation algorithm influences the password choice
of the user. If the user needs to adjust his password
according to the honeyword generation policy of the system,
then there is system interference. Here we define three levels
of system interference: (a) High system interference, where
the user needs to adjust or manipulate his password choice
based on a parameter value provided by the system. For
example, in “take-a-tail” [14], choosing a three-digit tail
may be considered a parameter, whereas “635” may be considered
a valid parameter value set by the system. This value must be
remembered by the user. (b) Low system interference, where the
user sets the value of the parameter used to manipulate his
password choice. The proposed method PDP requires setting the
RS (considered a parameter), where the parameter value (i.e.
the elements of RS) is chosen by the user. We consider this
low system interference because the system lets the user
choose a value of his own. (c) No system interference, where
the user does not need to manipulate his password choice.


C. Stress on memorability 

There is a relation between system interference and stress on
memorability. If the system interference of a honeyword scheme
is high, then the user has to remember different
system-generated information for different login accounts.
This increases the stress on memorability. On the other hand,
with a honeyword generation scheme that has low or no system
interference, the user may use the same login credential for
different login accounts, so the stress on memorability is
low. The proposed PDP approach imposes low stress on
memorability because of its low system interference.


VI. Storage cost 

Using the previously proposed honeyword generation
algorithms, the system maintains k-1 extra passwords along
with the original password of each user in the password file
F. In addition, the index of the original password of each
user is maintained in the “honeychecker” server. If we assume
that storing a single password requires θ memory space, then
storing the password information of n users requires nθk
space, whereas the required space in the “honeychecker” is nθ.
Using PDP, the system stores two pieces of information for
each user (password and distance chain). Thus, the password
storing cost for n users in password file F becomes 2nθ.
Though PDP maintains an hcl of size |hcl|θ, this storage cost
does not depend on the number of users, so the storage cost of
the hcl is negligible. As the memory cost of storing an index
value in the “honeychecker” is very similar to that of storing
a digit or letter, the required space in the “honeychecker”
remains nθ. Thus, PDP saves a memory overhead of nθ(k-2).

As any standard honeyword system maintains the value of k as
20 for a moderate detection rate [14], PDP saves a memory
overhead of 18nθ, which is a huge benefit.
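
The storage comparison reduces to simple arithmetic, sketched below.
The function names and the unit choice θ = 1 are illustrative
assumptions; the hcl term is kept as a separate constant because it
does not grow with the number of users.

```python
def honeyword_storage(n, k, theta=1.0):
    """Conventional scheme: k sweetwords per user plus one index per user."""
    password_file = n * k * theta
    honeychecker = n * theta
    return password_file + honeychecker


def pdp_storage(n, theta=1.0, hcl_size=0):
    """PDP: password and distance chain per user, one character per user
    in the honeychecker, plus a single shared hcl."""
    password_file = 2 * n * theta
    honeychecker = n * theta
    shared_hcl = hcl_size * theta  # independent of n, treated as negligible
    return password_file + honeychecker + shared_hcl


def pdp_saving(n, k, theta=1.0):
    """Saving n * theta * (k - 2), ignoring the shared hcl."""
    return honeyword_storage(n, k, theta) - pdp_storage(n, theta)


saving = pdp_saving(n=1000, k=20)  # 18 * n * theta for k = 20
```

For n = 1000 users and k = 20, the saving is 18000 units of θ, in
line with the 18nθ figure above.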

VII. Comparative analysis 

In this section, we compare PDP with some of the recently
proposed honeyword generation approaches, namely (a)
Chaffing-by-tweaking-digits (CTD) [14], (b) Take-a-tail [14]
and (c) Modelling-syntax-approach [5], in terms of security
and usability standards (shown in Table III).

The table shows that by choosing a highly random RS, the user
can attain the same security standard as “take-a-tail” in
terms of flatness and security against MSV. The limited
strength of “take-a-tail” against DoS attack [14] has also
been overcome by the PDP approach. Thus, depending upon the
randomness of RS, PDP ensures the highest level of security
standard. From the usability perspective, PDP improves on
“take-a-tail” in terms of system interference and stress on
memorability, which makes PDP a highly practical approach for
common users. Most importantly, PDP reduces the storage
overhead compared to all the existing approaches by storing
only a single extra piece of information, which is a huge
benefit.

















VIII. Related work 

Modern password cracking algorithms use the concept of
probabilistic context-free grammars [22]. In [15], Kelley et
al. characterize the vulnerability of passwords under the same
threat model [22] by considering different
password-composition policies. One such weak
password-composition policy is “basic8”, in which users are
instructed “Password must have at least 8 characters”. One
billion guesses are sufficient to guess 40.3% of such
passwords. In [3], the authors show that by using a single
graphical processing unit, three billion guesses per second
can be achieved when cracking hash functions like MD5. Among
70 million Yahoo users, it has been observed that the majority
of the passwords have little more than 20 bits of effective
entropy [19] against an optimal attacker [6].
The honeyword scheme strongly supports the conventional
password scheme in terms of providing security and can be
incorporated into the conventional password system.
To the best of our knowledge, Fred Cohen made the first
contribution in this domain in 2006 [9]. Thereafter, many
methodologies have been proposed in this direction, and the
idea has been deployed in many password-related domains.
Herley and Florencio [13] use this concept to protect online
banking accounts from brute-force attack. Bojinov et al.
propose the concept of “Kamouflage”, where the real password
of the user is stored along with fake passwords, but this does
not include the concept of a “honeychecker” server [5]. Later,
in [14], the authors introduce the concept of a “honeychecker”
server to detect password cracking. Recently, Chakraborty and
Mondal showed how honeywords can be used to detect shoulder
surfing attack [8].

IX. Conclusion 

Honeyword based techniques are becoming popular as they
provide several advantages over traditional password based
schemes. However, the storage cost is one of the major
overheads of honeyword based schemes. In this paper we have
proposed a novel honeyword generation approach which reduces
the storage overhead and also addresses the majority of the
drawbacks of existing honeyword generation techniques. The
only shortfall of PDP is that the user has to remember an
extra piece of information, namely the RS. In future we would
like to analyse the possibility of developing a honeyword
generation technique that does not require the user to
remember any extra information.

References 

[1] John the Ripper password cracker [online document; cited 2008 Oct 07].
Available: http://www.openwall.com


[2] M. H. Almeshekah, E. H. Spafford, and M. J. Atallah. Improving 
security using deception. Technical report, Technical Report CERIAS 
Tech Report 2013-13, Center for Education and Research Information 
Assurance and Security, Purdue University, 2013. 

[3] M. Bakker and R. Van Der Jagt. GPU-based password cracking.
University of Amsterdam, System and Network Engineering, Amsterdam,
Research, 2010.

[4] J. E. Belissent. Method and apparatus for preventing a denial of service
(DOS) attack by selectively throttling TCP/IP requests, Sept. 7 2004. US
Patent 6,789,203.

[5] H. Bojinov, E. Bursztein, X. Boyen, and D. Boneh. Kamouflage: Loss- 
resistant password management. In Computer Security-ESORICS 2010, 
pages 286-302. Springer, 2010. 

[6] J. Bonneau. The science of guessing: analyzing an anonymized corpus 
of 70 million passwords. In Security and Privacy (SP), 2012 IEEE 
Symposium on, pages 538-552. IEEE, 2012. 

[7] M. Burnett. 10000 top passwords.
https://xato.net/passwords/more-top-worst-passwords/

[8] N. Chakraborty and S. Mondal. Tag digit based honeypot to detect 
shoulder surfing attack. In Security in Computing and Communications, 
pages 101-110. Springer, 2014. 

[9] F. Cohen. The use of deception techniques: Honeypots and decoys. 
Handbook of Information Security, 3:646-655, 2006. 

[10] I. Erguler. Achieving flatness: Selecting the honeywords from existing 
user passwords. 

[11] C. Gaylord. LinkedIn, Last.fm, now Yahoo? Don’t ignore news of a
password breach. Christian Science Monitor, 13, 2012.

[12] D. Gross. 50 million compromised in evernote hack. CNN, March 2013. 

[13] C. Herley and D. Florencio. Protecting financial institutions from brute-
force attacks. In Proceedings of the IFIP TC 11 23rd International
Information Security Conference, pages 681-685. Springer, 2008.

[14] A. Juels and R. L. Rivest. Honeywords: Making password-cracking 
detectable. In Proceedings of the 2013 ACM SIGSAC conference on 
Computer & communications security, pages 145-160. ACM, 2013. 

[15] P. G. Kelley, S. Komanduri, M. L. Mazurek, R. Shay, T. Vidas, L. Bauer, 
N. Christin, L. F. Cranor, and J. Lopez. Guess again (and again and 
again): Measuring password strength by simulating password-cracking 
algorithms. In Security and Privacy (SP), 2012 IEEE Symposium on, 
pages 523-537. IEEE, 2012. 

[16] G. Kontaxis, E. Athanasopoulos, G. Portokalidis, and A. D. Keromytis. 
Sauth: Protecting user accounts from password database leaks. In 
Proceedings of the 2013 ACM SIGSAC conference on Computer & 
communications security, pages 187-198. ACM, 2013. 

[17] T. Kwon, S. Shin, and S. Na. Covert attentional shoulder surfing: 
Human adversaries are more powerful than expected. Systems, Man, 
and Cybernetics: Systems, IEEE Transactions on, 44(6):716-727, 2014. 

[18] J. Ma, W. Yang, M. Luo, and N. Li. A study of probabilistic password 
models. In Security and Privacy (SP), 2014 IEEE Symposium on, pages 
689-704. IEEE, 2014. 

[19] W. Ma, J. Campbell, D. Tran, and D. Kleeman. Password entropy and 
password quality. In Network and System Security (NSS), 2010 4th 
International Conference on, pages 583-587. IEEE, 2010. 

[20] R. Marois and J. Ivanoff. Capacity limits of information processing in 
the brain. Trends in cognitive sciences, 9(6):296-305, 2005. 

[21] N. Provos and D. Mazieres. A future-adaptable password scheme. In 
USENIX Annual Technical Conference, FREENIX Track, pages 81-91, 
1999. 

[22] M. Weir, S. Aggarwal, B. De Medeiros, and B. Glodek. Password 
cracking using probabilistic context-free grammars. In Security and 
Privacy, 2009 30th IEEE Symposium on, pages 391-405. IEEE, 2009. 


